Dataset: https://www.kaggle.com/datasets/rtatman/historical-american-lynching

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(leaflet)
library(dplyr)
library(plotly)
## 
## Vedhæfter pakke: 'plotly'
## 
## Det følgende objekt er maskeret fra 'package:ggplot2':
## 
##     last_plot
## 
## Det følgende objekt er maskeret fra 'package:stats':
## 
##     filter
## 
## Det følgende objekt er maskeret fra 'package:graphics':
## 
##     layout
library(ggplot2)
Lynching_data <- read_csv("data/Lynching-Data-Simone-Fixed-xlsx (1).csv")
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 2806 Columns: 36
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (11): State, Day, Victim, County, Race, Sex, Mob, Offense, Note, 2nd Nam...
## dbl  (4): Year, Mo, Latitude, Longitude
## lgl (21): Column, Comments, Column2, Column3, Column4, Column5, Column6, Col...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

To sort the data:

First I had to count how many got linched by race each of the datasets years. And this was done by using the count() function from the dplyr library.I was shown this way to do it by Adela in Hackyhours.

Vis=data.frame(Lynching_data %>% 
  group_by(Year, Race) %>% 
  count())

I then wanted to create my first visulization, and it was done with the help of this guide: https://r-graph-gallery.com/136-stacked-area-chart.html

Here I chose to show the killings by race through a geom_area() graph.

vis_1 <- ggplot(Vis, aes(x=Year,y=n, fill=Race))+
         geom_area()+
  labs(title = "Killed by race", 
       y= "Deaths per year",
       x= "Year")
ggplotly(vis_1)

This graph is not as easy to read as I had hoped it was. So I decided to create a line graph insted.

And for some reason I do not know why, the ggplottly will only show 1 year, and whould not be fixed no matter what I tried.

Line praph

This line graph was created with the help of https://r-graph-gallery.com/line-chart-several-groups-ggplot2.html

The n value is the number of people killed in that year.

vis_1_1 <- ggplot(Vis, aes(x=Year,y=n, fill=Race, color=Race))+
         geom_line()+
  labs(title = "Killed by race", 
       y= "Deaths per year",
       x= "Year")
ggplotly(vis_1_1)

Another usefull thing about this grapph is that you can hover over the year and get the precise number of deaths and year. Which is usefull as the graph is also not easy to read, by just looking at it. I could not figure out how to stretch the x and y-axes to make it easier to read. That is probebly the biggest flaw with this graph.

The split of the graph is from the homicide homework assignment:

vis_1_1_1 <- ggplot(Vis, aes(x=Year,y=n, fill=Race, color=Race))+
         geom_line()+
  labs(title = "Killed by race", 
       y= "Deaths per year",
       x= "Year")+
      facet_wrap(~ Race, ncol = 1)
ggplotly(vis_1_1_1)

##This is usefull to show how big the difference was between the race of the people that got lynched. And look at the evolution of people killed through time, but more on that in the final project.

Reasons for the lynchings

This was done by using this guide: https://r-graph-gallery.com/267-reorder-a-variable-in-ggplot2.html

Vis_Offense=data.frame(Lynching_data %>% 
  group_by(Offense, Race) %>% 
  count())
library(forcats)
Vis_Offense %>% 
  ggplot( aes(x=n, y=Offense, fill = Race)) +
    geom_bar(stat="identity", fill="#f68060", alpha=.5, width=.7) +
    xlab("") +
  labs(title = "Offence reasons, all races.", 
       y= "Reason for lynching",
       x= "Number of lynchings")+
    theme_bw()

This is clearly not ledgeble, so I tried to filter the data.

The first one here is a try to filter it so it only shows black victims, this was done with the help of this guide: https://www.youtube.com/watch?v=sDoGnfL0vzg&ab_channel=MikeJonasEconometrics

Vis_Offense1=filter(Vis_Offense, Race!='Wht')
Vis_Offense1=filter(Vis_Offense1, Race!='Unk')
Vis_Offense1=filter(Vis_Offense1, Race!='Other')
Vis_Offense1 %>% 
  ggplot( aes(x=n, y=Offense, fill = Race)) +
    geom_bar(stat="identity", fill="#f68060", alpha=.5, width=.7) +
    xlab("") +
  labs(title = "Offence reasons for black victims.", 
       y= "Reason for lynching",
       x= "Number of lynchings")+
    theme_bw()

This shows there are simply to many datapoints still. But I still wanted to try and do the same with the white victims.

Vis_Offense2=filter(Vis_Offense, Race!='Blk')
Vis_Offense2=filter(Vis_Offense2, Race!='Unk')
Vis_Offense2=filter(Vis_Offense2, Race!='Other')
Vis_Offense2 %>% 
  ggplot( aes(x=n, y=Offense, fill = Race)) +
    geom_bar(stat="identity", fill="#f68060", alpha=.5, width=.7) +
    xlab("") +
  labs(title = "Offence reasons for white victims.", 
       y= "Reason for lynching",
       x= "Number of lynchings")+
    theme_bw()

This is readeble in the big window. And shows ecatly what I wanted it to. The only problem is that I could not get it to add more numbers on the x-axis.